-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: fix pytest errors on higher versions of PyTorch #697
base: master
Are you sure you want to change the base?
Conversation
@@ -132,7 +132,11 @@ | |||
|
|||
import torch | |||
import torch.cuda | |||
from torch._six import string_classes | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder does torch remove this torch._six
in higher versions? do you know which version?
It's better for us to remove these dependencies on private modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wangraying torch._six has been removed from PyTorch 2.0.0.
@@ -44,6 +48,7 @@ | |||
) | |||
|
|||
import bagua.torch_api.data_parallel.functional as bagua_dist | |||
from bagua.torch_api.contrib.sync_batchnorm import _SYNC_BN_V5, _SYNC_BN_V6, _SYNC_BN_V7 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should couple this test with sync bn together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wangraying This approach aims to make the test cases compatible with multiple versions of PyTorch.
Do you have any other suggestions for addressing the compatibility issues of the test cases with multiple versions of PyTorch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree using LooseVersion**
as the condition, but it's kind of confusing to import the condition from sync bn.
@@ -29,3 +29,98 @@ jobs: | |||
run: | | |||
rm -rf bagua bagua_core | |||
pytest -s --timeout=300 --timeout_method=thread | |||
build-ten: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why adding so many steps on different containers here, rather than changing above **pytorch-1.9.0**
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wangraying This is just to test if it can run properly on different versions of PyTorch. If this PR needs to be merged, the final testing environment will either PyTorch 1.9.0 or PyTorch 1.13.0.
If the test case will to be changed to PyTorch 1.13.0 version later, we can still keep the PyTorch 1.9.0 testing environment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked pytorch lightning's ci, their pytest cpu test supporting multiple torch versions, but gpu test mainly running on latest pytorch release. Maybe we could do similarly, or just maintaining all ci tests on pytorch 1.13?
No description provided.